Era-aware AI vocabulary breakdown + speculative gap-filling pattern by philippdubach · Pull Request #111 · blader/humanizer

philippdubach · 2026-05-01T16:53:43Z

Summary

Two narrowly scoped updates sourced from the current revision of Wikipedia: Signs of AI writing (revision fetched 2026-05-01).

- §7 (AI Vocabulary): Replaces the flat high-frequency word list with the era-specific clusters now documented on the wiki page (GPT-4 / GPT-4o / GPT-5 eras). Adds bolstered and meticulous/meticulously to the master list, plus a one-line caveat about literal vs figurative usage (e.g., underscore as a literal underline, delve in geology).
- §21 (renamed to "Knowledge-Cutoff Disclaimers and Speculative Gap-Filling"): Covers the newer retrieval-augmented pattern where a model, having failed to find a source, writes a paragraph about not having found one and then speculates that the subject "maintains a low profile" or "keeps personal details private." Adds a second before/after example for the gap-filling case.
- README: Tightens the §21 row label to reflect both subpatterns.

No new patterns; pattern count stays at 29. No version bump — happy to defer that to whatever coordination you do with the open v2.6.0 PRs (#85, #98).
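
As a rough illustration of the §21 gap-filling tell described above, the following minimal sketch flags the two phrases quoted in this summary. The function name `find_gap_filling` and the idea of a plain substring check are assumptions made for illustration; the skill itself is a prose instruction file, and nothing here is taken from SKILL.md.

```python
import re

# The two phrases quoted in the PR summary. Illustrative only: this is
# not the skill's actual rule set, and a real list would be longer.
GAP_FILLING_PHRASES = (
    "maintains a low profile",
    "keeps personal details private",
)


def find_gap_filling(text):
    """Return the gap-filling phrases that occur in text.

    Matching is case-insensitive and tolerant of line wraps: runs of
    whitespace are collapsed to single spaces before the search.
    """
    normalized = re.sub(r"\s+", " ", text.lower())
    return [phrase for phrase in GAP_FILLING_PHRASES if phrase in normalized]


if __name__ == "__main__":
    sample = "No sources were found. She maintains a low\nprofile online."
    print(find_gap_filling(sample))
```

A hit only marks a sentence for review; whether the claim is unsourced filler still has to be judged against the available sources.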

Test plan

- Diff touches two files: SKILL.md and README.md
- §7 keeps its existing Before/After example unchanged
- §21 keeps its existing Before/After example as the cutoff-disclaimer case, and adds a separate gap-filling Before/After
- Pattern numbering and section anchors are unchanged
- Skill loads in Claude Code with no parse errors

Source: Wikipedia:Signs of AI writing — see "High density of AI vocabulary words" and "Knowledge-cutoff disclaimers and speculation about gaps in sources" sections.

…tern

Two changes sourced from Wikipedia: Signs of AI writing (revision fetched 2026-05-01).

§7 (AI Vocabulary): replace the flat high-frequency word list with the era-specific clusters now documented on the wiki page (GPT-4 / GPT-4o / GPT-5 eras). Add 'bolstered' and 'meticulous/meticulously' to the master list, and a one-line caveat about literal vs figurative usage.

§21 (renamed to "Knowledge-Cutoff Disclaimers and Speculative Gap-Filling"): cover the newer retrieval-augmented pattern where the model, having failed to find a source, writes a paragraph about not having found one and then speculates that the subject "maintains a low profile" or "keeps personal details private." Adds a second before/after example for the gap-filling case.

README: tighten the §21 row label to reflect both subpatterns.

No version bump (leaving that to the maintainer to coordinate with the open v2.6.0 PRs). No new patterns; pattern count stays at 29.

Brings the fork's main branch in line with the maintained local v2.6.0, consolidating the changes that are also opened as focused PRs against blader/humanizer (blader#111, blader#112, blader#113):

- §7 expanded with era-specific AI vocabulary clusters (GPT-4 / GPT-4o / GPT-5 eras), plus 'bolstered' and 'meticulous' added to the master list and a literal-vs-figurative caveat.
- §21 renamed to "Knowledge-Cutoff Disclaimers and Speculative Gap-Filling"; covers the retrieval-augmented "maintains a low profile" / "keeps personal details private" speculation pattern.
- New patterns §30-34: reference-markup artifacts (turn0search0, oaicite, utm_source=chatgpt.com, etc.), placeholder leftovers, Markdown/wikitext contamination, formal "Conclusion" closers, didactic disclaimers.
- New Detection Guidance group: what NOT to flag (false positives), signs of human writing to preserve, and per-model LLM idiolects.

Frontmatter version bumped to 2.6.0. README pattern table updated (29 → 34 patterns) with a new Artifacts and Contamination section and a pointer to Detection Guidance. WARP.md count corrected from the stale "25 patterns" to 34.

Sourced from Wikipedia: Signs of AI writing (revision fetched 2026-05-01).
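
The reference-markup artifacts named in this commit message (turn0search0, oaicite, utm_source=chatgpt.com) are literal strings, so they lend themselves to a mechanical check. The sketch below is one possible way to scan for them. The function name, the dictionary keys, and the generalization of `turn0search0` to `turn<digits>search<digits>` are assumptions made for illustration, not something the commit states.

```python
import re

# Patterns built from the three examples listed in the commit message.
# The digit-generalized "turn" pattern is an assumption; the source only
# names the literal token turn0search0.
ARTIFACT_PATTERNS = {
    "turn_token": re.compile(r"\bturn\d+search\d+\b"),
    "oaicite": re.compile(r"oaicite"),
    "utm_source": re.compile(r"utm_source=chatgpt\.com"),
}


def find_artifacts(text):
    """Return the sorted names of the artifact patterns found in text."""
    return sorted(
        name for name, pattern in ARTIFACT_PATTERNS.items() if pattern.search(text)
    )


if __name__ == "__main__":
    sample = "See https://example.com/?utm_source=chatgpt.com (turn0search0)."
    print(find_artifacts(sample))
```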

- Add DETECTION GUIDANCE section (false positives, human-writing signs, LLM idiolects) so editors know what NOT to flag (PR blader#113)
- Add Tier-1 AI-iness density pre-flight in Full mode; auto-drops to Quick when density = 0 to protect human-first drafts (PR blader#115 adapted)
- Expand blader#7 with era-specific vocabulary clusters (GPT-4 / GPT-4o / GPT-5 eras) and figurative-vs-literal caveat (PR blader#111)
- Expand blader#9 with "rather than" dismissals + on-the-table test (PR blader#85)
- Expand blader#14 with paired em dash bracketing + 4 fix options (PR blader#85)
- Expand blader#21 with speculative gap-filling ("maintains a low profile" template detection) (PR blader#111)
- Expand blader#23 with three more didactic disclaimers (subsumes pattern 34 from PR blader#112)
- Expand blader#25 with structural "## Conclusion" section note
- Add pattern blader#35 Debunking-Pose Headings -- heading-level AI tells that slip through prose-only passes (PR blader#116)
- Add patterns blader#36 Conditional Frame Stacking and blader#37 Miscalibrated Epistemic Confidence (PR blader#85)
- Add patterns blader#38 Reference-Markup Artifacts, blader#39 Phrasal Templates / Placeholder Text, and blader#40 Markdown / Wikitext Contamination -- three chat-UI copy-paste tells that confirm AI involvement (PR blader#112)
- Extend domain overrides for blader#35-37; blader#38-40 are universal
- Extend final AI audit from 9 to 13 points
- README: pattern count 34 -> 40, three new section rows, updated fork-differentiator table, 3.2.0 version-history entry

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
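
The density pre-flight mentioned in this commit (Full mode drops to Quick when density = 0) could look roughly like the sketch below. The commit does not define how density is computed, so the hits-per-word ratio, the small word set (taken from the words named in this PR), and the names `ai_density` and `choose_mode` are all assumptions for illustration.

```python
import re

# Words named in this PR as AI-vocabulary examples. The real master list
# is longer; this subset is only for demonstration.
AI_VOCAB = {"delve", "underscore", "bolstered", "meticulous", "meticulously"}


def ai_density(text):
    """Assumed definition: fraction of words that appear in AI_VOCAB."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in AI_VOCAB)
    return hits / len(words)


def choose_mode(text, requested="full"):
    """Drop a requested Full pass to Quick when no listed word is present."""
    if requested == "full" and ai_density(text) == 0:
        return "quick"
    return requested


if __name__ == "__main__":
    print(choose_mode("The cat sat on the mat."))
    print(choose_mode("We delve into the data."))
```

A plain word count cannot tell literal from figurative use (the caveat about underscore and delve in this PR), so a nonzero density only justifies running the fuller pass, not rewriting the flagged words.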

- §14: turn em-dash "overuse" into a hard cut (no em or en dashes in the final rewrite), with a replacement ladder and a final scan. Idea from #96.
- §21: expand to cover speculative gap-filling ("maintains a low profile," "keeps personal details private") where a model invents filler instead of saying a source is missing. Idea from #111.
- New pattern #30, diff-anchored writing: describe the thing as it is, not as a narration of what changed. Idea from #81.

Hand-ported lean versions rather than merging the source PRs. 30 patterns total; README and AGENTS.md updated to match.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
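
The final scan that the §14 change calls for can be approximated with a character-level check. The sketch below only reports where em and en dashes occur; it does not attempt the replacement ladder, which this commit message does not spell out. The function name `scan_dashes` is hypothetical.

```python
# Code points for the two characters the §14 rule bans from a final rewrite.
DASHES = {"\u2014": "em dash", "\u2013": "en dash"}


def scan_dashes(text):
    """Return (line, column, kind) for every em or en dash, 1-indexed."""
    found = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, char in enumerate(line, start=1):
            if char in DASHES:
                found.append((line_no, col, DASHES[char]))
    return found


if __name__ == "__main__":
    draft = "a \u2014 b\nc\u2013d"
    for hit in scan_dashes(draft):
        print(hit)
```

An empty result means the draft passes the hard cut; ordinary hyphens are left alone.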

blader · 2026-05-27T02:56:06Z

Adopted the speculative gap-filling half in v2.7.0 — §21 now covers 'maintains a low profile / keeps personal details private' as unsourced filler. Left out the era-vocabulary taxonomy on purpose (it's forensic dating that ages quickly; we just removed a similar model-fingerprinting section). Thanks for surfacing the gap-filling tell.

philippdubach mentioned this pull request May 1, 2026

Add Detection Guidance: false positives, human-writing signs, LLM idiolects #113

Merged

4 tasks

blader closed this May 27, 2026

philippdubach wants to merge 1 commit into blader:main from philippdubach:era-vocab-and-gap-filling
